Linear layers in neural networks (NNs) trained by gradient descent can be expressed as a key-value memory system that stores all training data points and the initial weights, and produces outputs using unnormalised dot attention over the entire training experience. While this has been technically known since the 1960s, no prior work has effectively studied the operations of NNs in such a form, presumably due to prohibitive time and space complexities and impractical model sizes, all of which grow linearly with the number of training patterns, which may be very large. However, this dual formulation offers the possibility of directly visualising how an NN uses training patterns at test time, by examining the corresponding attention weights. We conduct experiments on small-scale supervised image classification tasks in single-task, multi-task, and continual learning settings, as well as on language modelling, and discuss the potential and limits of this view for better understanding and interpreting how NNs exploit training patterns. Our code is public.
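The dual formulation above can be checked numerically in a few lines. The following toy numpy sketch (illustrative only, not the paper's code) trains a linear layer by per-pattern gradient descent while storing each step's (key, value) contribution, then verifies that the primal output W·x equals the initial projection plus unnormalised dot attention over all stored training patterns:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a linear layer trained by plain per-pattern gradient descent.
X = rng.normal(size=(20, 5))        # 20 training patterns, 5 features
Y = rng.normal(size=(20, 3))        # 3 output units
W0 = rng.normal(size=(3, 5))        # initial weights
lr = 0.01

# Primal view: update the weight matrix, recording each rank-1 step.
W = W0.copy()
keys, values = [], []
for epoch in range(50):
    for x, y in zip(X, Y):
        err = W @ x - y             # error signal on this pattern
        keys.append(x.copy())       # key   = the training input
        values.append(-lr * err)    # value = the scaled error signal
        W -= lr * np.outer(err, x)  # rank-1 gradient update

x_test = rng.normal(size=5)
primal = W @ x_test

# Dual view: initial projection plus unnormalised dot attention over
# every (key, value) pair from the whole training experience.
dual = W0 @ x_test + sum(v * (k @ x_test) for k, v in zip(keys, values))

assert np.allclose(primal, dual)
```

The attention weights `k @ x_test` are exactly what the paper proposes to inspect in order to see which training patterns drive a given test-time prediction.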
The weight matrix (WM) of a neural network (NN) is its program. The programs of many traditional NNs are learned through gradient descent on some error function and then remain fixed. A self-referential WM, however, can keep rapidly modifying itself at runtime. In principle, such NNs can meta-learn to learn, meta-meta-learn to meta-learn to learn, and so on, in the sense of recursive self-improvement. NN architectures potentially capable of implementing such behaviour have been proposed since the 1990s, but there have been few if any practical studies. Here we revisit such NNs, building on the recent successes of fast weight programmers and the closely related linear Transformers. We propose a scalable self-referential WM (SRWM) that learns to modify itself using outer products and the delta update rule. We evaluate our SRWM on supervised few-shot learning and on multi-task reinforcement learning with procedurally generated game environments. Our experiments demonstrate both the practical applicability and competitive performance of the proposed SRWM. Our code is public.
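A minimal numpy illustration of the update primitive named above, the delta rule applied via an outer product (in the actual SRWM the keys, values, and learning rates are generated by the weight matrix itself from its own input stream, which this sketch does not model):

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_out = 8, 4
W = rng.normal(size=(d_out, d_in)) * 0.1

def delta_update(W, k, v, beta):
    """Delta rule used by fast weight programmers: interpolate the
    value currently associated with key k towards the new value v,
    via a rank-1 outer-product update."""
    k = k / np.linalg.norm(k)          # normalise the key
    v_old = W @ k                      # value currently stored under k
    return W + beta * np.outer(v - v_old, k)

k = rng.normal(size=d_in)
v_target = rng.normal(size=d_out)

# With beta = 1, querying with the (normalised) key afterwards
# retrieves exactly the newly written value.
W = delta_update(W, k, v_target, beta=1.0)
assert np.allclose(W @ (k / np.linalg.norm(k)), v_target)
```

The interpolation through `v_old` is what distinguishes the delta rule from a purely additive Hebbian write: rewriting the same key replaces the stored value instead of accumulating interference.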
We share our experience with the recently released WILDS benchmark, a collection of ten datasets dedicated to developing models and training strategies that are robust to domain shifts. Several experiments yield a few critical observations that we believe are of general interest for any future work on WILDS. Our study focuses on two datasets: iWildCam and FMoW. We show that (1) conducting separate cross-validation for each evaluation metric is crucial for both datasets; (2) a weak correlation between validation and test performance may make model development difficult for iWildCam; (3) minor changes to the training hyper-parameters improve the baseline by relatively large margins (mainly on FMoW); and (4) there are strong correlations between certain domains and certain target labels (mainly on iWildCam). To the best of our knowledge, no prior work on these datasets has reported these observations despite their obvious importance. Our code is public.
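Observation (1) can be made concrete with a tiny sketch: rather than picking a single configuration for all metrics, one configuration is selected per evaluation metric. The configuration names, metric names, and scores below are made up purely for illustration:

```python
# Hypothetical CV results: one entry per hyper-parameter configuration,
# with one validation score per evaluation metric.
cv_results = {
    "lr=1e-3,wd=0.0":  {"macro_f1": 0.31, "worst_acc": 0.52},
    "lr=1e-4,wd=0.0":  {"macro_f1": 0.35, "worst_acc": 0.49},
    "lr=1e-4,wd=1e-2": {"macro_f1": 0.33, "worst_acc": 0.55},
}

# "Separate cross-validation for each evaluation metric" amounts to
# selecting the best configuration independently per metric: here no
# single configuration wins on both metrics at once.
best = {m: max(cv_results, key=lambda c: cv_results[c][m])
        for m in ["macro_f1", "worst_acc"]}

assert best["macro_f1"] == "lr=1e-4,wd=0.0"
assert best["worst_acc"] == "lr=1e-4,wd=1e-2"
```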
The inputs and/or outputs of some neural networks (NNs) are weight matrices of other NNs. Indirect encodings or end-to-end compression of weight matrices could help to scale such approaches. Our goal is to open a discussion on this topic, starting with recurrent NNs for character-level language modelling whose weight matrices are encoded by the discrete cosine transform. Our fast weight version uses a recurrent NN to parameterise the compressed weights. We present experimental results on the enwik8 dataset.
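A minimal numpy sketch of the encoding primitive: an orthonormal DCT basis applied to a weight matrix, keeping only low-frequency coefficients. (Illustrative only; in the paper the DCT coefficients parameterise RNN weight matrices inside a trained model, which is not modelled here.)

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (C @ C.T = I)."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    C = np.cos(np.pi * (t + 0.5) * k / n) * np.sqrt(2.0 / n)
    C[0] /= np.sqrt(2.0)
    return C

def compress_weights(W, keep):
    """Encode a weight matrix by its 2D DCT coefficients, zero out
    the high frequencies, and decode back (lossy if keep < full size)."""
    Cr, Cc = dct_matrix(W.shape[0]), dct_matrix(W.shape[1])
    coeffs = Cr @ W @ Cc.T                 # 2D DCT
    mask = np.zeros_like(coeffs)
    mask[:keep, :keep] = 1.0               # keep only low frequencies
    return Cr.T @ (coeffs * mask) @ Cc     # inverse 2D DCT

rng = np.random.default_rng(2)
W = rng.normal(size=(16, 16))

# Keeping all coefficients reconstructs W exactly (orthonormal basis)...
assert np.allclose(compress_weights(W, 16), W)
# ...while fewer coefficients give a lossy but much smaller description.
W_lossy = compress_weights(W, 4)
assert W_lossy.shape == W.shape
```

Keeping a `keep x keep` corner reduces the description of a 16x16 matrix from 256 to 16 numbers at `keep=4`, which is the kind of indirect encoding the abstract refers to.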
Deep learning (DL) has become a driving force and has been widely adopted in many domains and applications with competitive performance. In practice, to solve nontrivial and complicated tasks in real-world applications, DL is often not used standalone, but instead contributes as one component of a larger, complex AI system. Although there is a fast-growing trend of studying the quality issues of deep neural networks (DNNs) at the model level, few studies have investigated the quality of DNNs at the unit level and its potential impacts at the system level. More importantly, systematic investigation of how to perform risk assessment for AI systems from the unit level to the system level is also lacking. To bridge this gap, this paper initiates an early exploratory study of AI system risk assessment from both the data distribution and uncertainty angles. We propose a general framework with an exploratory study for analyzing AI systems. After large-scale experiments (700+ experimental configurations and 5000+ GPU hours) and in-depth investigations, we reached several interesting key findings that highlight the practical need and opportunities for more in-depth investigations into AI systems.
When beginners learn to speak a non-native language, it is difficult for them to judge for themselves whether they are speaking well. Therefore, computer-assisted pronunciation training systems are used to detect learner mispronunciations. These systems typically compare the user's speech with that of a specific native speaker serving as a model, in units of rhythm, phonemes, or words, and calculate the differences. However, they require extensive speech data with detailed annotations, or can only compare against one specific native speaker. To overcome these problems, we propose a new language-learning support system that calculates speech scores and detects mispronunciations by beginners based on a small amount of unannotated speech data, without comparison to a specific person. The proposed system uses deep learning-based speech processing to display the pronunciation score of the learner's speech and the difference/distance between the learner's pronunciation and that of a group of models in an intuitively visual manner. Learners can gradually improve their pronunciation by eliminating the differences and shortening the distance to the models until they become sufficiently proficient. Furthermore, since the pronunciation score and the difference/distance are not tied to specific sentences of a particular model, users are free to study the sentences they wish to study. We also built an application to help non-native speakers learn English and confirmed that it can improve users' speech intelligibility.
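The distance-to-a-group idea can be sketched as follows. Everything here is a hypothetical placeholder: the embeddings stand in for features from a deep speech encoder, and the decay constant mapping distance to a 0-100 score is an arbitrary choice, not the paper's formula:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical fixed-size pronunciation embeddings: a *group* of model
# speakers replaces the single reference speaker of older systems.
model_group = rng.normal(loc=0.0, scale=1.0, size=(10, 16))
learner = rng.normal(loc=0.5, scale=1.0, size=16)

# Distance from the learner to the centre of the model group.
centroid = model_group.mean(axis=0)
distance = np.linalg.norm(learner - centroid)

# Map distance to a 0..100 score: closer to the group scores higher.
score = 100.0 * np.exp(-distance / 4.0)
assert 0.0 <= score <= 100.0
```

Because the score depends only on embedding distance, not on matching a particular model speaker's rendition of a particular sentence, the learner is free to practise any sentence, which mirrors the property highlighted in the abstract.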
We consider task allocation for multi-object transport using a multi-robot system, in which each robot selects one object among multiple objects with different and unknown weights. Existing centralized methods assume that the numbers of robots and tasks are fixed, which is inapplicable to scenarios that differ from the learning environment. Meanwhile, existing distributed methods fix only the minimum numbers of robots and tasks, which makes them applicable to varying numbers of robots and tasks. However, they cannot transport an object whose weight exceeds the load capacity of the robots observing it. To handle varying numbers of robots and objects with different and unknown weights, we propose a framework that uses multi-agent reinforcement learning for task allocation. First, we introduce a structured policy model consisting of (1) predesigned dynamic task priorities with global communication and (2) a neural-network-based distributed policy model that determines the timing of coordination. The distributed policy builds consensus on the high-priority object under local observations and selects cooperative or independent actions. The policy is then optimized by multi-agent reinforcement learning through trial and error. This structured policy of local learning and global communication makes our framework applicable to varying numbers of robots and objects with different and unknown weights, as demonstrated by numerical simulations.
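A toy sketch of the first component, globally communicated task priorities combined with purely local selection (object names and priority values are hypothetical, and the learned part that decides between cooperative and independent actions is not modelled):

```python
# Hypothetical global task priorities, broadcast to all robots
# (higher value = more urgent); selection itself stays local.
priority = {"obj_a": 3, "obj_b": 1, "obj_c": 2}

def select_object(visible):
    """Each robot picks the highest-priority object among those it can
    currently observe; robots seeing the same top object naturally
    converge on it, which is the consensus the abstract describes."""
    return max(visible, key=priority.__getitem__) if visible else None

assert select_object({"obj_b", "obj_c"}) == "obj_c"
assert select_object({"obj_a", "obj_b", "obj_c"}) == "obj_a"
assert select_object(set()) is None
```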
In this paper, we present a solution to the design problem of control strategies for multi-agent cooperative transport. Existing learning-based methods assume that the number of agents is the same as that in the training environment, but the number may differ in reality, considering that robots' batteries may fully discharge or additional robots may be introduced to reduce the time required to complete a task. Therefore, it is crucial that the learned strategy be applicable to scenarios in which the number of agents differs from that in the training environment. We propose a novel multi-agent reinforcement learning framework of event-triggered communication and consensus-based control for distributed cooperative transport. The proposed policy model estimates the resultant force and torque in a consensus manner, using the neighboring agents' estimates of the resultant force and torque. Moreover, it computes the control and communication inputs that determine when to communicate with the neighboring agents, based on local observations and the estimates of the resultant force and torque. The proposed framework can therefore balance control performance against communication savings in scenarios in which the number of agents differs from that in the training environment. We confirm the effectiveness of our approach using a maximum of eight robots in simulations and six robots in experiments.
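The consensus part can be sketched with a plain averaging protocol over a communication graph. This is illustrative only: the paper's policy is learned and event-triggered and estimates force and torque, whereas the sketch runs fixed-rate consensus on synthetic 2-D vectors:

```python
import numpy as np

rng = np.random.default_rng(3)
n_agents = 6
# Each agent holds a noisy local estimate of a shared 2-D quantity
# (standing in for the resultant force acting on the carried object).
local = rng.normal(loc=[1.0, -0.5], scale=0.3, size=(n_agents, 2))

# Ring communication graph: each agent talks to its two neighbours.
neighbours = {i: [(i - 1) % n_agents, (i + 1) % n_agents]
              for i in range(n_agents)}

x = local.copy()
eps = 0.2  # consensus step size; stable since eps < 1 / max degree
for _ in range(200):
    x = x + eps * np.array([
        sum(x[j] - x[i] for j in neighbours[i]) for i in range(n_agents)
    ])

# All agents converge to the average of the initial local estimates,
# with only neighbour-to-neighbour communication.
assert np.allclose(x, local.mean(axis=0), atol=1e-6)
```

The event-triggered extension in the paper would replace the fixed 200-step loop with a learned rule that communicates only when an agent's estimate drifts far enough from its last broadcast value.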
We propose a lightweight and highly efficient joint detection and tracking pipeline for the task of multi-object tracking using a fully transformer-based architecture. It is a modified version of TransTrack that overcomes the computational bottleneck associated with its design and, at the same time, achieves a state-of-the-art MOTA score of 73.20%. The model design is driven by a transformer-based backbone instead of a CNN, which is highly scalable with the input resolution. We also propose a drop-in replacement for the feed-forward network of the transformer encoder layer, using the Butterfly Transform operation to perform channel fusion and depth-wise convolution to learn spatial context within the feature maps, which is otherwise missing from the attention maps of the transformer. As a result of our modifications, we reduce the overall model size of TransTrack by 58.73% and its complexity by 78.72%. We therefore expect our design to provide novel perspectives for architecture optimization in future research on multi-object tracking.
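A minimal sketch of butterfly-style channel fusion, the structure behind the Butterfly Transform mentioned above. Illustrative and simplified: a single 2x2 weight block is shared within each stage for brevity, whereas a full Butterfly Transform carries separate learned parameters per pair:

```python
import numpy as np

def butterfly_mix(x, stage_weights):
    """Butterfly-style fusion over the channel dimension: log2(n)
    stages, each mixing the channel pairs (i, i + stride) with a 2x2
    block. After all stages every output channel depends on every
    input channel, at O(n log n) cost instead of O(n^2)."""
    x = x.copy()
    n = x.shape[0]
    stride = 1
    for W in stage_weights:              # one 2x2 block per stage
        for i in range(n):
            if i & stride == 0:          # i is the "top" of its pair
                j = i + stride
                a, b = x[i], x[j]
                x[i] = W[0, 0] * a + W[0, 1] * b
                x[j] = W[1, 0] * a + W[1, 1] * b
        stride *= 2
    return x

rng = np.random.default_rng(4)
n = 8                                             # channels, power of 2
stages = [rng.normal(size=(2, 2)) for _ in range(3)]  # log2(8) stages
y = butterfly_mix(rng.normal(size=n), stages)
assert y.shape == (n,)
```

With 3 stages of 8 paired multiplies this uses 24 multiply pairs for 8 channels, versus 64 multiplies for a dense 8x8 fusion, which is the source of the complexity reduction the abstract reports.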
Graph databases (GDBs) enable the processing and analysis of unstructured, complex, rich, and often vast graph datasets. Despite the great significance of GDBs in both academia and industry, little effort has been made to integrate them with the predictive power of graph neural networks (GNNs). In this work, we show how to seamlessly combine nearly any GNN model with the computational capabilities of GDBs. For this, we observe that most of these systems are based on, or support, a graph data model called the Labeled Property Graph (LPG), in which vertices and edges can have arbitrarily complex sets of labels and properties. We then develop LPG2vec, an encoder that transforms an arbitrary LPG dataset into a representation that can be used directly with a broad class of GNNs, including convolutional, attentional, message-passing, and even higher-order or spectral models. In our evaluation, we show that LPG2vec properly preserves the rich information represented by LPG labels and properties, and that it increases the accuracy of predictions by up to 34% compared to graphs with no LPG labels/properties, regardless of the targeted learning task or the GNN model used. In general, LPG2vec enables combining the predictive power of the most powerful GNNs with the full scope of information encoded in the LPG model, paving the way for neural graph databases, a class of systems in which the vast complexity of the maintained data will benefit from modern and future graph machine learning methods.
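A toy sketch of the encoding idea: one-hot encode each vertex's labels and append its numeric properties, yielding fixed-size feature vectors that any standard GNN can consume. The miniature LPG below is hypothetical; the real LPG2vec handles arbitrary label sets, property types, and edge labels/properties as well:

```python
import numpy as np

# A tiny labeled property graph: each vertex carries a set of labels
# and a property dict (hypothetical toy schema, not the paper's data).
vertices = {
    0: {"labels": {"Person"},          "props": {"age": 34}},
    1: {"labels": {"Person", "Admin"}, "props": {"age": 51}},
    2: {"labels": {"Post"},            "props": {"age": 0}},
}

def encode_lpg(vertices, label_vocab, prop_keys):
    """LPG2vec-style idea: one-hot encode the labels and append the
    numeric properties, producing one fixed-size feature vector per
    vertex for consumption by a GNN."""
    feats = []
    for v in sorted(vertices):
        one_hot = [1.0 if l in vertices[v]["labels"] else 0.0
                   for l in label_vocab]
        nums = [float(vertices[v]["props"].get(k, 0.0)) for k in prop_keys]
        feats.append(one_hot + nums)
    return np.array(feats)

X = encode_lpg(vertices, label_vocab=["Person", "Admin", "Post"],
               prop_keys=["age"])
assert X.shape == (3, 4)
assert X[1].tolist() == [1.0, 1.0, 0.0, 51.0]
```

The resulting matrix `X` plays the role of the node-feature input of any standard GNN layer, which is what lets the label/property information reported in the evaluation flow into the learning task.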